Appendix of Synergy-of-Experts: 1 Theoretical Proofs

Neural Information Processing Systems

From Figure 1(a), learning multiple linear sub-models and averaging their predictions (an ensemble) still yields a linear model, so it cannot tackle the XOR problem. We compare the training cost of all methods from two aspects: 1) the sub-model training ensures that most adversarial attacks on the sub-models can be successfully defended. In particular, we train two kinds of models to defend against the attacks. From Figures 2(a) and 2(b), when 0.01 ≤ ϵ ≤ 0.04, SoE without the collaboration training achieves robustness similar to that of full SoE.
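The XOR observation can be checked numerically. The sketch below is a toy illustration (not the paper's code): it trains several linear sub-models independently and shows that averaging their scores is itself still an affine function, so the ensemble cannot fit XOR.

```python
import numpy as np

# The four XOR points and their labels: no single hyperplane separates them.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

def train_linear(seed):
    """Train one linear sub-model (logistic regression) by gradient descent."""
    rng = np.random.default_rng(seed)
    w, b = rng.normal(size=2), 0.0
    for _ in range(2000):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
        g = p - y                                 # cross-entropy gradient
        w -= 0.125 * (X.T @ g)
        b -= 0.125 * g.sum()
    return w, b

# Averaging the sub-models' scores is itself an affine function of x,
# so the "ensemble" is still one linear classifier.
models = [train_linear(s) for s in range(5)]
avg_w = np.mean([w for w, _ in models], axis=0)
avg_b = np.mean([b for _, b in models])
preds = (X @ avg_w + avg_b > 0).astype(int)

acc = (preds == y).mean()
print(acc)  # at most 0.75: a linear decision boundary cannot solve XOR
```

Any hyperplane misclassifies at least one of the four XOR points, so the averaged model can never reach 100% accuracy, which is exactly why non-linear collaboration between sub-models is needed.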



Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions: Supplementary Materials

Neural Information Processing Systems

In this section, we introduce the implementation details of the adopted model architectures and self-supervised learning tasks. The EdgeConv layers are stacked to form the DGCNN backbone. As introduced in Section 2.2, we choose k = 3, 4 in this task. We follow the attack setups in [13] to formulate our attack. We provide insights on how different components contribute to the overall improvements.
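A minimal sketch of one EdgeConv layer may help fix ideas. This is an assumption-laden illustration: a plain linear map stands in for DGCNN's shared MLP, and all names, shapes, and the choice k = 4 are for demonstration only.

```python
import numpy as np

def knn(points, k):
    """Indices of the k nearest neighbors of each point (excluding itself)."""
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (N, N)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]   # (N, k)

def edge_conv(points, k, W, b):
    """One EdgeConv layer (illustrative): for each point x_i and neighbor x_j,
    build the edge feature [x_i, x_j - x_i], apply a shared linear map (a
    stand-in for the MLP h_theta), then max-pool over the k neighbors."""
    idx = knn(points, k)                              # (N, k)
    xi = np.repeat(points[:, None, :], k, axis=1)     # (N, k, D)
    xj = points[idx]                                  # (N, k, D)
    edge = np.concatenate([xi, xj - xi], axis=-1)     # (N, k, 2D)
    h = np.maximum(edge @ W + b, 0.0)                 # shared weights + ReLU
    return h.max(axis=1)                              # max over neighbors

rng = np.random.default_rng(0)
pts = rng.normal(size=(32, 3))            # a toy cloud of 32 points in 3D
W, b = rng.normal(size=(6, 16)), np.zeros(16)
out = edge_conv(pts, k=4, W=W, b=b)
print(out.shape)  # (32, 16)
```

Stacking such layers, with the k-NN graph recomputed in feature space at each layer, is what forms the DGCNN backbone referred to above.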



Blurred-Dilated Method for Adversarial Attacks

Neural Information Processing Systems

Deep neural networks (DNNs) are vulnerable to adversarial attacks, which lead to incorrect predictions. In black-box settings, transfer attacks can be conveniently used to generate adversarial examples. However, such examples tend to overfit the specific architecture and feature representations of the source model, resulting in poor attack performance against other target models. To overcome this drawback, we propose a novel model-modification-based transfer attack, the Blurred-Dilated (BD) method. In summary, BD works by reducing downsampling while introducing BlurPool and dilated convolutions into the source model.
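The two ingredients can be sketched on a single channel. This is an illustrative numpy version under assumed 3x3 kernels, not the paper's implementation: BlurPool low-pass filters before subsampling, and dilation enlarges a kernel's receptive field without extra downsampling.

```python
import numpy as np

def blur_pool_1ch(x, stride=2):
    """BlurPool on one channel: smooth with a fixed binomial kernel, then
    subsample. Swapping naive strided downsampling for blur-then-subsample
    is the anti-aliasing step applied to the source model."""
    k1 = np.array([1.0, 2.0, 1.0])
    kernel = np.outer(k1, k1)
    kernel /= kernel.sum()                      # normalized 3x3 binomial filter
    p = np.pad(x, 1, mode="edge")
    blurred = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            blurred[i, j] = (p[i:i + 3, j:j + 3] * kernel).sum()
    return blurred[::stride, ::stride]

def dilated_weights(weights, rate=2):
    """Insert zeros between kernel taps: a 3x3 kernel at rate 2 covers a 5x5
    receptive field, compensating for the downsampling that was removed."""
    k = weights.shape[0]
    out = np.zeros((rate * (k - 1) + 1,) * 2)
    out[::rate, ::rate] = weights
    return out

x = np.arange(36, dtype=float).reshape(6, 6)
y = blur_pool_1ch(x)
print(y.shape)                                   # (3, 3)
print(dilated_weights(np.ones((3, 3))).shape)    # (5, 5)
```

The intuition is that the smoother, higher-resolution feature maps of the modified source model yield gradients that depend less on one architecture's aliasing artifacts, which is what makes the generated examples transfer better.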


DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles

Neural Information Processing Systems

Recent research finds that CNN models for image classification exhibit overlapping adversarial vulnerabilities: adversarial attacks can mislead CNN models with small perturbations, and these perturbations transfer effectively between different models trained on the same dataset. Adversarial training, as a general robustness improvement technique, eliminates the vulnerability in a single model by forcing it to learn robust features. The process is hard, often requires models with large capacity, and suffers from significant loss of clean-data accuracy. Alternatively, ensemble methods are proposed to induce sub-models with diverse outputs against a transferred adversarial example, making the ensemble robust against transfer attacks even if each sub-model is individually non-robust; only a small clean-accuracy drop is observed in the process. However, previous ensemble training methods are not effective at inducing such diversity and thus fail to produce a robust ensemble. We propose DVERGE, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features, and diversifies the adversarial vulnerabilities to induce diverse outputs against a transfer attack. The novel diversity metric and training procedure enable DVERGE to achieve higher robustness against transfer attacks than previous ensemble methods, and the robustness improves further as more sub-models are added to the ensemble.
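The feature-distillation step can be sketched with a toy linear "feature extractor". This is an assumption for illustration only (DVERGE distills intermediate CNN features): starting from a target input, we search inside a small L-infinity ball for a point whose features match those of a source input, i.e. a point that carries the source's non-robust features while looking like the target.

```python
import numpy as np

def distill_features(W, x_source, x_target, eps=0.1, steps=200, lr=0.01):
    """DVERGE-style feature distillation (sketch): with a linear feature map
    f(x) = W @ x standing in for a sub-model's intermediate layer, minimize
    ||W z - W x_source||^2 over z, projected back onto the L_inf ball of
    radius eps around x_target after every gradient step."""
    z = x_target.copy()
    f_src = W @ x_source
    for _ in range(steps):
        grad = 2 * W.T @ (W @ z - f_src)                 # feature-loss gradient
        z -= lr * grad
        z = np.clip(z, x_target - eps, x_target + eps)   # project to the ball
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))                 # toy 8-dim input, 4-dim features
x_s, x_t = rng.normal(size=8), rng.normal(size=8)
z = distill_features(W, x_s, x_t)

in_ball = np.abs(z - x_t).max() <= 0.1 + 1e-9
closer = np.linalg.norm(W @ z - W @ x_s) < np.linalg.norm(W @ x_t - W @ x_s)
print(in_ball, closer)  # True True
```

Training each sub-model to classify such distilled inputs by the label of the *target* (rather than the source whose features they carry) is what pushes the sub-models to rely on different non-robust features, so a perturbation that fools one sub-model fails to transfer to the others.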


Text Prompt Injection of Vision Language Models

Zhu, Ruizhe

arXiv.org Artificial Intelligence

The widespread application of large vision language models has significantly raised safety concerns. In this project, we investigate text prompt injection, a simple yet effective method to mislead these models. We developed an algorithm for this type of attack and demonstrated its effectiveness and efficiency through experiments. Compared to other attack methods, our approach is particularly effective against large models while demanding few computational resources.



Evaluating the Robustness of a Production Malware Detection System to Transferable Adversarial Attacks

Nasr, Milad, Fratantonio, Yanick, Invernizzi, Luca, Albertini, Ange, Farah, Loua, Petit-Bianco, Alex, Terzis, Andreas, Thomas, Kurt, Bursztein, Elie, Carlini, Nicholas

arXiv.org Artificial Intelligence

As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level vulnerabilities with real-world impact. This paper studies how adversarial attacks targeting an ML component can degrade or bypass an entire production-grade malware detection system, performing a case-study analysis of Gmail's pipeline, where file-type identification relies on an ML model. The malware detection pipeline used by Gmail contains a machine learning model that routes each potential malware sample to a specialized malware classifier to improve accuracy and performance. This model, called Magika, has been open-sourced. By designing adversarial examples that fool Magika, we can cause the production malware service to incorrectly route malware to an unsuitable malware detector, thereby increasing our chance of evading detection. Specifically, by changing just 13 bytes of a malware sample, we can evade Magika in 90% of cases and thereby send malware files over Gmail. We then turn our attention to defenses and develop an approach that mitigates the severity of these attacks: against our defended production model, a highly resourced adversary requires 50 bytes to achieve just a 20% attack success rate. We implemented this defense, and, thanks to a collaboration with Google engineers, it has already been deployed in production for the Gmail classifier.
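The routing-evasion idea can be illustrated with a toy byte-signature router. This is entirely hypothetical: Magika is a learned deep model, and the real attack optimizes adversarial bytes against it; the sketch only shows the shape of a minimal-byte-change attack that flips a file-type decision.

```python
def toy_router(data, signatures):
    """Toy stand-in for a learned file-type router: score each type by how
    many of its signature bytes appear at the expected offsets, then route
    the sample to the highest-scoring type."""
    scores = {t: sum(data[off] == byte for off, byte in sig.items())
              for t, sig in signatures.items()}
    return max(scores, key=scores.get)

def greedy_byte_attack(data, signatures, target, budget=16):
    """Greedily rewrite the bytes that raise the target type's score,
    stopping as soon as the router's decision flips: a minimal-byte
    evasion in the spirit of the 13-byte attack described above."""
    data = bytearray(data)
    changed = 0
    for off, byte in signatures[target].items():
        if toy_router(data, signatures) == target or changed >= budget:
            break
        if data[off] != byte:
            data[off] = byte
            changed += 1
    return bytes(data), changed

sigs = {
    "zip": {0: 0x50, 1: 0x4B, 2: 0x03, 3: 0x04},   # 'PK\x03\x04'
    "elf": {0: 0x7F, 1: 0x45, 2: 0x4C, 3: 0x46},   # '\x7fELF'
}
sample = bytes([0x50, 0x4B, 0x03, 0x04]) + bytes(60)  # a zip-like sample
adv, n = greedy_byte_attack(sample, sigs, target="elf")
print(toy_router(adv, sigs), n)  # elf 3
```

In the real system, flipping the routed type means the sample reaches a specialized classifier that was never trained on that kind of file, which is what raises the attacker's odds of slipping past detection.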